15 research outputs found

    Neural Machine Translation with Word Predictions

    Full text link
    In the encoder-decoder architecture for neural machine translation (NMT), the hidden states of the recurrent structures in the encoder and decoder carry the crucial information about the sentence. These vectors are generated by parameters that are updated by back-propagating translation errors through time. We argue that propagating errors through the end-to-end recurrent structures is not a direct way of controlling these hidden vectors. In this paper, we propose to use word predictions as a mechanism for direct supervision. More specifically, we require these vectors to be able to predict the vocabulary of the target sentence. This simple mechanism ensures better representations in the encoder and decoder without using any extra data or annotation. It is also helpful in reducing the target-side vocabulary and improving decoding efficiency. Experiments on Chinese-English and German-English machine translation tasks show BLEU improvements of 4.53 and 1.3, respectively. Comment: Accepted at EMNLP 2017.
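
    The supervision mechanism described above, an auxiliary loss that asks a hidden state to predict the bag of words of the target sentence, can be sketched in a few lines. The following is a minimal PyTorch sketch rather than the authors' code; the WordPredictor module and the multi-hot target_bow encoding are illustrative assumptions.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class WordPredictor(nn.Module):
        """Auxiliary head: predict the target sentence's bag of words from a hidden state."""
        def __init__(self, hidden_size, vocab_size):
            super().__init__()
            self.proj = nn.Linear(hidden_size, vocab_size)

        def forward(self, hidden, target_bow):
            # hidden: (batch, hidden_size) summary vector from the encoder or decoder
            # target_bow: (batch, vocab_size) multi-hot indicators of target-sentence words
            logits = self.proj(hidden)
            # multi-label objective: every word appearing in the target should be predicted
            return F.binary_cross_entropy_with_logits(logits, target_bow.float())

    # Hypothetical training step: the auxiliary loss is simply added to the
    # usual translation loss, e.g.
    #   loss = translation_loss + aux_weight * predictor(encoder_state, target_bow)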

    Classification Based on both Attribute Value Weight and Tuple Weight under the Cloud Computing

    Get PDF
    In recent years, cloud computing has attracted more and more attention. Users need to deal with massive data in the cloud computing environment, and classification can predict users' needs from such large data. Traditional classification methods frequently adopt one of two strategies: either remove an instance once it is covered by a rule, or decrease the tuple weight of an instance once it is covered by a rule. The quality of these traditional classifiers may not be high, so they cannot achieve high classification accuracy on some datasets. In this paper, we present a new classification approach, called classification based on both attribute value weight and tuple weight (CATW). CATW is distinguished from traditional classifiers in two aspects. First, CATW uses both attribute value weights and tuple weights. Second, CATW introduces a new measure to select the best attribute values and generate a high-quality classification rule set. Our experimental results indicate that CATW can achieve higher classification accuracy than some traditional classifiers.
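
    The two ideas in the abstract, decaying tuple weights instead of removing covered instances and scoring attribute values by the weight they cover, can be illustrated with a short sketch. This is a minimal Python illustration, not the authors' CATW implementation; the decay factor and the weight-sum scoring rule are assumptions.

    from collections import defaultdict

    def learn_rules(instances, labels, decay=0.5, rounds=10):
        """instances: list of {attribute: value} dicts; labels: parallel class labels."""
        weights = [1.0] * len(instances)  # every tuple starts with full weight
        rules = []
        for _ in range(rounds):
            # score each (attribute, value, label) by the total tuple weight it covers
            scores = defaultdict(float)
            for w, inst, y in zip(weights, instances, labels):
                for attr, val in inst.items():
                    scores[(attr, val, y)] += w
            (attr, val, y), _ = max(scores.items(), key=lambda kv: kv[1])
            rules.append(((attr, val), y))
            # decrease, rather than remove, the weight of covered tuples so they
            # can still influence later rules
            for i, inst in enumerate(instances):
                if inst.get(attr) == val:
                    weights[i] *= decay
        return rules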

    DirectQE: Direct Pretraining for Machine Translation Quality Estimation

    No full text
    Machine Translation Quality Estimation (QE) is the task of predicting the quality of machine translations without relying on any reference. Recently, the predictor-estimator framework has trained the predictor as a feature extractor, which leverages extra parallel corpora without QE labels and achieves promising QE performance. However, we argue that there are gaps between the predictor and the estimator in both data quality and training objectives, which preclude QE models from benefiting more directly from large parallel corpora. We propose a novel framework called DirectQE that provides direct pretraining for QE tasks. In DirectQE, a generator is trained to produce pseudo data that is closer to the real QE data, and a detector is pretrained on these data with novel objectives that are akin to the QE task. Experiments on widely used benchmarks show that DirectQE outperforms existing methods, without using any pretrained models such as BERT. We also give extensive analyses showing how fixing the two gaps contributes to our improvements.
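
    The pretraining recipe, a generator that corrupts real translations into pseudo QE data and a detector pretrained with QE-like objectives, can be sketched as follows. This minimal Python sketch assumes a word-replacement generator and word-level OK/BAD tags; neither detail is taken from the abstract, and generator_sample is a hypothetical callable.

    import random

    def make_pseudo_qe_example(target_tokens, generator_sample, replace_prob=0.15):
        """Corrupt a reference translation into pseudo MT output with word-level labels."""
        tokens, labels = [], []
        for tok in target_tokens:
            if random.random() < replace_prob:
                tokens.append(generator_sample(tok))  # generator proposes a plausible substitute
                labels.append("BAD")
            else:
                tokens.append(tok)
                labels.append("OK")
        return tokens, labels

    # The detector is then pretrained to recover these OK/BAD labels from the
    # (source, corrupted target) pair, which mirrors the real QE task far more
    # closely than generic language-model pretraining.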